#loading packages
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
## 
##     date, intersect, setdiff, union
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3     ✓ purrr   0.3.4
## ✓ tibble  3.0.5     ✓ stringr 1.4.0
## ✓ tidyr   1.1.2     ✓ forcats 0.5.0
## ✓ readr   1.4.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x lubridate::as.difftime() masks base::as.difftime()
## x lubridate::date()        masks base::date()
## x dplyr::filter()          masks stats::filter()
## x lubridate::intersect()   masks base::intersect()
## x dplyr::lag()             masks stats::lag()
## x lubridate::setdiff()     masks base::setdiff()
## x lubridate::union()       masks base::union()
library(ggridges) # for joy plots
library(plotly) 
## 
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## The following object is masked from 'package:stats':
## 
##     filter
## The following object is masked from 'package:graphics':
## 
##     layout
library(gganimate)     # for adding animation layers to ggplots
library(gifski)        # for creating the gif (no need to load this every time, but it must be installed)
#loading data
spotify <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-01-21/spotify_songs.csv')
## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   .default = col_double(),
##   track_id = col_character(),
##   track_name = col_character(),
##   track_artist = col_character(),
##   track_album_id = col_character(),
##   track_album_name = col_character(),
##   track_album_release_date = col_character(),
##   playlist_name = col_character(),
##   playlist_id = col_character(),
##   playlist_genre = col_character(),
##   playlist_subgenre = col_character()
## )
## ℹ Use `spec()` for the full column specifications.
spotify_rap <- spotify %>% 
  filter(playlist_genre == "rap")

randb <- spotify %>%
  filter(playlist_genre == "r&b") %>%
  select(-track_id, -track_album_id, -playlist_id, -playlist_name) %>%
  filter(track_popularity >= 75)

Introduction & Background

Why did we analyze Spotify data? Why is the data significant, and why should people care? Introduce the data to the audience.

Using this dataset, we hope to study the technicalities of music and

Aside from personal interest…

Data Collection

Data retrieved from GitHub (https://github.com/rfordatascience/tidytuesday/blob/faca0b6bd282998693007c329e3f4b917a5fd7a8/data/2020/2020-01-21/readme.md). Who collected the data and what purpose does it serve? Who funded the data collection? Are there any possible biases? What are the implications of analyzing this dataset, ethical or otherwise?

How has the popularity of genres changed over time?

genre_pop <- spotify %>%
  filter(track_popularity >= 75) %>%
  mutate(ymd_release = ymd(track_album_release_date),
         year = year(ymd_release)) %>%
  group_by(year, playlist_genre) %>%
  summarize(avg_popularity = mean(track_popularity)) %>%
  ggplot(aes(x = year, y = avg_popularity, color = playlist_genre)) +
  geom_point() +
  labs(title = "Average song popularity by genre per year",
       subtitle = "Overall, as music becomes more accessible, average popularity across all genres is on the rise.",
       x = "",
       y = "",
       color = "Genre") +
  theme_classic()
## Warning: Problem with `mutate()` input `ymd_release`.
## ℹ  68 failed to parse.
## ℹ Input `ymd_release` is `ymd(track_album_release_date)`.
## Warning: 68 failed to parse.
## `summarise()` regrouping output by 'year' (override with `.groups` argument)
ggplotly(genre_pop)
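
The 68 parse failures reported above most likely come from release dates that record only a year, or a year and a month, which `ymd()` rejects by default. A minimal sketch of one way around this, assuming that is the cause: lubridate's `truncated` argument lets `ymd()` accept dates whose trailing components are missing. The `dates` vector is made up for illustration.

```r
library(lubridate)

# Made-up examples of the three shapes a release date might take
dates <- c("2019-10-10", "2012-05", "1998")

# truncated = 2 allows up to two trailing components (month, day) to be missing
parsed <- ymd(dates, truncated = 2)

# Extract the year, as the plotting pipeline does
release_year <- year(parsed)
release_year
```

Since the plot only uses the year, filling in the first of the month or year for the missing parts should not affect the result.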
prelim_graph <- spotify %>%
  ggplot(aes(y = playlist_genre, x = track_popularity)) +
  labs(title = "Song Popularity by Genre",
       x = "", y = "",
       subtitle = "Song popularity is measured from 0-100, with higher numbers indicating greater popularity.\nThe highest median popularities belong to pop and latin, with an overall median popularity of 40",
       caption = "Alex Ismail, Malek Kaloti, Brian Lee") +
  theme_classic() + 
  theme(plot.title.position = "plot",
        plot.title = element_text(size = 20, face = "bold"),
        plot.subtitle = element_text(size = 10, face = "italic")) +
  geom_boxplot() +
  geom_vline(aes(xintercept = median(track_popularity, na.rm = TRUE)), color = "blue") 

prelim_graph
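
The medians that the boxplot and the blue reference line display can also be computed directly, which makes it easier to check the numbers quoted in the subtitle. This is a sketch on made-up data (`toy` and its values are invented); the same pipeline could be pointed at `spotify`.

```r
library(dplyr)

# Made-up popularity scores, only to show the shape of the computation
toy <- tibble(
  playlist_genre   = c("pop", "pop", "pop", "rap", "rap", "rap"),
  track_popularity = c(60, 50, 70, 30, 45, 40)
)

# Median popularity per genre, highest first
genre_medians <- toy %>%
  group_by(playlist_genre) %>%
  summarize(median_popularity = median(track_popularity), .groups = "drop") %>%
  arrange(desc(median_popularity))

# Overall median across all tracks (what the vertical line shows)
overall_median <- median(toy$track_popularity)

genre_medians
overall_median
```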

spotify %>%
  filter(track_popularity >= 75) %>%
  ggplot(aes(x = track_popularity, y = playlist_genre)) +
  labs(x = "Popularity", y = "Playlist Genre") +
  geom_density_ridges() +
  theme_ridges()
## Picking joint bandwidth of 1.37

#get rid of axes, add a more descriptive subtitle


Analysis!

Rap

Rap is a particularly fascinating genre to investigate with the Spotify data: which traits of the music have correlated with popularity as the genre has undergone several changes in audience and style? Though a relatively new genre that arrived on the greater music scene in the 80s, rap has gone through a myriad of trends and style variations. Fans of old school rap from the 80s and 90s may have a distaste for today’s artists like Drake and Eminem for having modernized the genre too much. Fans of modern rap may get bored of the authentic sound of artists like Run-DMC or Tupac. Are there trends that tie all of rap together and explain what makes a song popular?

The first and most natural observations to make are on the overarching metrics that Spotify provides. Using the descriptions provided, I was most interested in how the following values correlate with track popularity: Danceability, due to rap’s heavy emphasis on rhythm and beats; Energy, due to some artists’ signature style of shouting to “hype” up a crowd (e.g., Lil Jon, DMX); the inverse variables Speechiness and Instrumentalness, due to other artists’ signature of rapping as fast as possible (e.g., Eminem, Busta Rhymes); and Valence, for the perceived association between rap and violence, drugs, and other less-than-righteous topics.
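
One simple way to put a number on "correlation with track popularity" is a correlation coefficient per feature. The sketch below uses Pearson correlation on a made-up data frame (`toy_rap` and all of its values are invented); it is an illustration of the idea, not necessarily the computation behind the figure in this section.

```r
library(dplyr)

# Invented values that mimic the column names in the Spotify data
toy_rap <- tibble(
  track_popularity = c(80, 65, 40, 72, 55),
  danceability     = c(0.85, 0.70, 0.55, 0.80, 0.60),
  energy           = c(0.60, 0.75, 0.50, 0.65, 0.70),
  valence          = c(0.40, 0.55, 0.30, 0.45, 0.50)
)

# Pearson correlation of each feature with popularity, one column per feature
feature_cors <- toy_rap %>%
  summarize(across(c(danceability, energy, valence),
                   ~ cor(.x, track_popularity)))

feature_cors
```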

## `summarise()` has grouped output by 'Stat1'. You can override using the `.groups` argument.
## Warning: Problem with `mutate()` input `Stat`.
## ℹ Unknown levels in `f`: Rounded_Energy, Rounded_Speechiness, Rounded_Instrumental, Rounded_Valence
## ℹ Input `Stat` is `fct_recode(...)`.
## ℹ The error occurred in group 1: Stat1 = "Rounded_Danceability".
## Warning: Problem with `mutate()` input `Stat`.
## ℹ Unknown levels in `f`: Rounded_Danceability, Rounded_Speechiness, Rounded_Instrumental, Rounded_Valence
## ℹ Input `Stat` is `fct_recode(...)`.
## ℹ The error occurred in group 2: Stat1 = "Rounded_Energy".
## Warning: Problem with `mutate()` input `Stat`.
## ℹ Unknown levels in `f`: Rounded_Danceability, Rounded_Energy, Rounded_Speechiness, Rounded_Valence
## ℹ Input `Stat` is `fct_recode(...)`.
## ℹ The error occurred in group 3: Stat1 = "Rounded_Instrumental".
## Warning: Problem with `mutate()` input `Stat`.
## ℹ Unknown levels in `f`: Rounded_Danceability, Rounded_Energy, Rounded_Instrumental, Rounded_Valence
## ℹ Input `Stat` is `fct_recode(...)`.
## ℹ The error occurred in group 4: Stat1 = "Rounded_Speechiness".
## Warning: Problem with `mutate()` input `Stat`.
## ℹ Unknown levels in `f`: Rounded_Danceability, Rounded_Energy, Rounded_Speechiness, Rounded_Instrumental
## ℹ Input `Stat` is `fct_recode(...)`.
## ℹ The error occurred in group 5: Stat1 = "Rounded_Valence".
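
The warnings above suggest that `fct_recode()` ran inside a `mutate()` on data still grouped by `Stat1`, so each group saw only its own level and reported the other four as unknown. The chunk itself is not shown, so the following is only a sketch under that assumption: `stats` and its values are invented, and three of the five levels are included for brevity. Ungrouping before recoding lets `fct_recode()` see every level at once.

```r
library(dplyr)
library(forcats)

# Invented stand-in for the summarised data, still grouped by Stat1
stats <- tibble(
  Stat1 = c("Rounded_Danceability", "Rounded_Energy", "Rounded_Valence"),
  value = c(0.7, 0.6, 0.4)
) %>%
  group_by(Stat1)

# Drop the grouping first so the recode is applied to the whole column
relabelled <- stats %>%
  ungroup() %>%
  mutate(Stat = fct_recode(Stat1,
                           "Danceability" = "Rounded_Danceability",
                           "Energy"       = "Rounded_Energy",
                           "Valence"      = "Rounded_Valence"))

levels(relabelled$Stat)
```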

spotify %>%
  mutate(track_name_lower = str_to_lower(track_name),
         # patterns must be lowercase because the names were lowercased above
         remix = str_detect(track_name_lower, "remix"),
         feature = str_detect(track_name_lower, "feat"),
         ma_prep = remix | feature,
         # tracks with a missing name are counted as single-artist
         ma_prep2 = replace_na(ma_prep, FALSE),
         multiple_artists = if_else(ma_prep2, true = "Multiple Artists", false = "One Artist"),
         popular = track_popularity > 75) %>%
  group_by(multiple_artists, playlist_genre) %>%
  # percentage of tracks in each group that count as popular
  summarize(prop_pop = mean(popular) * 100) %>%
  # show rap in the first facet
  mutate(genre = fct_relevel(playlist_genre, "rap")) %>%
  ggplot() +
  geom_col(aes(x = multiple_artists, y = prop_pop)) +
  facet_wrap(~genre) +
  labs(title = "Popularity of Songs Containing Multiple Artists Across Genre",
       x = "", y = "Percent of Songs Popular") +
  theme_classic() +
  theme(plot.title.position = "plot",
        plot.title = element_text(size = 20, face = "bold"),
        plot.subtitle = element_text(size = 10, face = "italic"))
## `summarise()` has grouped output by 'multiple_artists'. You can override using the `.groups` argument.
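
Because the track names are lowercased before matching, the patterns passed to `str_detect()` need to be lowercase too, or the match needs to be made case-insensitive. A small illustration with made-up track names:

```r
library(stringr)

# Made-up track names for illustration
track_names <- c("Song A - Remix", "Song B (feat. Artist C)", "Song D")
lowered <- str_to_lower(track_names)

# A capitalised pattern can never match lowercased text
capital_match <- str_detect(lowered, "Remix")

# A lowercase pattern matches as intended
lower_match <- str_detect(lowered, "remix")

# Alternative: skip the lowercasing and ignore case in the pattern itself
ignore_case_match <- str_detect(track_names, regex("remix", ignore_case = TRUE))
```

Either of the last two forms flags the first track as a remix; the first form flags nothing.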

R&B

Why R&B?

In this section, I want to take a closer look at one of my favorite genres of music, R&B. I think I love it so much because it’s often good music to unwind to – it’s smooth, slow, and relaxing. I also love its versatility! R&B can fit the mood of anything from a gloomy, rainy day to a bright, sunny day. But why? What characteristics make R&B such a great genre to listen to? Using the Spotify dataset and some visualizations that look at the specific characteristics of the most popular R&B songs (songs with a popularity rating of 75 or above), I hope to come closer to answering these questions.

randb %>%
  group_by(track_name) %>%
  arrange(desc(track_popularity)) %>%
  head(12)
## # A tibble: 12 x 19
## # Groups:   track_name [8]
##    track_name track_artist track_popularity track_album_name track_album_rel…
##    <chr>      <chr>                   <dbl> <chr>            <chr>           
##  1 ROXANNE    Arizona Zer…               99 ROXANNE          2019-10-10      
##  2 ROXANNE    Arizona Zer…               99 ROXANNE          2019-10-10      
##  3 The Box    Roddy Ricch                98 Please Excuse M… 2019-12-06      
##  4 Memories   Maroon 5                   98 Memories         2019-09-20      
##  5 Blinding … The Weeknd                 98 Blinding Lights  2019-11-29      
##  6 Blinding … The Weeknd                 98 Blinding Lights  2019-11-29      
##  7 The Box    Roddy Ricch                98 Please Excuse M… 2019-12-06      
##  8 Tusa       KAROL G                    98 Tusa             2019-11-07      
##  9 Memories   Maroon 5                   98 Memories         2019-09-20      
## 10 Circles    Post Malone                98 Hollywood's Ble… 2019-09-06      
## 11 Don't Sta… Dua Lipa                   97 Don't Start Now  2019-10-31      
## 12 everythin… Billie Eili…               97 everything i wa… 2019-11-13      
## # … with 14 more variables: playlist_genre <chr>, playlist_subgenre <chr>,
## #   danceability <dbl>, energy <dbl>, key <dbl>, loudness <dbl>, mode <dbl>,
## #   speechiness <dbl>, acousticness <dbl>, instrumentalness <dbl>,
## #   liveness <dbl>, valence <dbl>, tempo <dbl>, duration_ms <dbl>

Above are the most popular songs in the R&B genre; several tracks appear more than once, so the 12 rows cover 8 distinct songs. We can see that all of them were released in 2019 and all are categorized under my two favorite subgenres of R&B, Urban Contemporary and Hip Pop. All of them also boast a danceability score above 0.5, and most of them (with the exception of Maroon 5’s Memories and Billie Eilish’s everything i wanted) have energy scores above 0.5. Across the board, the songs have low speechiness and instrumentalness scores (with the exception of Billie Eilish’s everything i wanted). Interestingly, all of the songs fall within a valence of 0.2-0.6. The other characteristics are quite varied. So, for the purposes of my analysis of the R&B genre, I will focus only on the song characteristics that show clear trends across the genre: danceability, energy, speechiness, instrumentalness, and valence.
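
The listing above repeats some tracks, so a top-N table built with `head()` alone shows fewer distinct songs than rows. A minimal sketch of one way to list each track once, using made-up data (`toy_randb` is invented) and assuming that a track is identified by its name and artist:

```r
library(dplyr)

# Invented rows with the same kind of repetition as the listing above
toy_randb <- tibble(
  track_name       = c("Song A", "Song A", "Song B", "Song C", "Song B"),
  track_artist     = c("Artist 1", "Artist 1", "Artist 2", "Artist 3", "Artist 2"),
  track_popularity = c(99, 99, 98, 97, 98)
)

top_tracks <- toy_randb %>%
  # keep the first row for each name/artist pair, with all other columns
  distinct(track_name, track_artist, .keep_all = TRUE) %>%
  arrange(desc(track_popularity)) %>%
  head(10)

top_tracks
```

Note that `arrange()` ignores grouping by default, so a preceding `group_by(track_name)` does not change the ordering.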

What now? Why is our analysis important?

As it becomes easier to produce and release music from one’s own bedroom, and as streaming platforms such as Apple Music and Spotify increasingly make music accessible to everyone, we believe our analysis has important implications: it can help listeners find new songs that they like and help platforms build algorithms that give their users better and more relevant song recommendations.

A disclaimer: Correlation does not equal causation.

Of course, correlation does not equal causation. Just because the

A Conclusion

Thanks to streaming platforms such as Spotify and Apple Music, small creators are also given a platform for creative release. Our analyses of pop, rap, and R&B can also help small artists grow their own platforms and cater to the interests of specific audiences. At a time like now, when the consumption of art (whether in the form of movies, music, or television) is essential to one’s mental wellbeing, our analysis can help boost these efforts. By asking the question, “What makes a song in a given genre popular?”, we have taken a close look at the specific characteristics of songs with a popularity rating of 75 or higher.
